Discovering Visual Concept Structure with Sparse and Incomplete Tags

نویسندگان

Jingya Wang

Xiatian Zhu

Shaogang Gong

چکیده

Discovering automatically the semantic structure of tagged visual data (e.g. web videos and images) is important for visual data analysis and interpretation, enabling the machine intelligence for effectively processing the fast-growing amount of multi-media data. However, this is non-trivial due to the need for jointly learning underlying correlations between heterogeneous visual and tag data. The task is made more challenging by inherently sparse and incomplete tags. In this work, we develop a method for modelling the inherent visual data concept structures based on a novel Hierarchical-Multi-Label Random Forest model capable of correlating structured visual and tag information so as to more accurately interpret the visual semantics, e.g. disclosing meaningful visual groups with similar high-level concepts, and recovering missing tags for individual visual data samples. Specifically, our model exploits hierarchically structured tags of different semantic abstractness and multiple tag statistical correlations in addition to modelling visual and tag interactions. As a result, our model is able to discover more accurate semantic correlation between textual tags and visual features, and finally providing favourable visual semantics interpretation even with highly sparse and incomplete tags. We demonstrate the advantages of our proposed approach in two fundamental applications, visual data clustering and missing tag completion, on benchmarking video (i.e. TRECVID MED 2011) and image (i.e. NUS-WIDE) datasets.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Video Semantic Clustering with Sparse and Incomplete Tags

Clustering tagged videos into semantic groups is important but challenging due to the need for jointly learning correlations between heterogeneous visual and tag data. The task is made more difficult by inherently sparse and incomplete tag labels. In this work, we develop a method for accurately clustering tagged videos based on a novel Hierarchical-MultiLabel Random Forest model capable of cor...

متن کامل

Visual Tagging Through Social Collaboration: A Concept Paper

Collaborative tagging has grown on the Internet as a new paradigm for web information discovering, ltering and retrieval. In the physical world, we use visual tags: labels readable by smartphones with cameras. While visual tags are usually related to a web site address, collaborative tagging, instead, provides updated, recommended information contributed and shared by users. In this paper we in...

متن کامل

Discovering Multi-Aspect Structure to Learn From Loosely Labeled Image Collections

Tremendous amounts of images readily available on the Web have simple annotations and tags. However, at best these tags can be considered as “loose labels”, since the richness of the visual information in the images far exceeds the description a few words can provide. While the coarse tags do not translate to a unified visual concept—making it difficult to directly use tagged images to train st...

متن کامل

Towards Complete, Geo-referenced 3d Models from Crowd-sourced Amateur Images

Despite a lot of recent research, photogrammetric reconstruction from crowd-sourced imagery is plagued by a number of recurrent problems. (i) The resulting models are chronically incomplete, because even touristic landmarks are photographed mostly from a few “canonical” viewpoints. (ii) Man-made constructions tend to exhibit repetitive structure and rotational symmetries, which lead to gross er...

متن کامل

Tags Re-ranking Using Multi-level Features in Automatic Image Annotation

Automatic image annotation is a process in which computer systems automatically assign the textual tags related with visual content to a query image. In most cases, inappropriate tags generated by the users as well as the images without any tags among the challenges available in this field have a negative effect on the query's result. In this paper, a new method is presented for automatic image...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

Artif. Intell.

دوره 250 شماره

صفحات -

تاریخ انتشار 2017

Discovering Visual Concept Structure with Sparse and Incomplete Tags

نویسندگان

چکیده

منابع مشابه

Video Semantic Clustering with Sparse and Incomplete Tags

Visual Tagging Through Social Collaboration: A Concept Paper

Discovering Multi-Aspect Structure to Learn From Loosely Labeled Image Collections

Towards Complete, Geo-referenced 3d Models from Crowd-sourced Amateur Images

Tags Re-ranking Using Multi-level Features in Automatic Image Annotation

عنوان ژورنال:

اشتراک گذاری